27 research outputs found

    Standardization of a Communication Middleware for High-Performance Real-Time Systems

    The last several years have seen the emergence of standardization activities for real-time systems, including standardization of operating systems (the series of POSIX standards [1]), of communication for distributed systems (POSIX.21 [10]) and parallel systems (MPI/RT [5]), and of real-time object management (real-time CORBA [9]). This article describes the ongoing standardization work on, and implementation of, communication middleware for high-performance real-time computing. The Real-Time Message Passing Interface (MPI/RT) advances the non-real-time, high-performance Message Passing Interface (MPI) standard, emphasizing changes that enable and support real-time communication, and is targeted at embedded, fault-tolerant, and other real-time systems. MPI/RT is the only communication middleware layer that provides guaranteed quality of service and timeliness for data transfers; it is also targeted at real-time CORBA, as a replacement for the RPC layer, and at real-time and embedded Java.
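
    The distinctive claim here is guaranteed quality of service and timeliness for data transfers. As a rough illustration of that idea only (the Channel class and its deadline bookkeeping below are invented for this sketch and are not the MPI/RT API), a transfer can be checked against a per-channel deadline:

        import time

        class Channel:
            """Hypothetical deadline-checked channel (NOT the MPI/RT API)."""

            def __init__(self, deadline_s):
                self.deadline_s = deadline_s  # maximum allowed transfer time, seconds

            def send(self, payload, transfer):
                """Run `transfer(payload)` and report whether the deadline held."""
                start = time.monotonic()
                transfer(payload)
                elapsed = time.monotonic() - start
                return elapsed <= self.deadline_s, elapsed

        # A sleep stands in for the actual network transfer in this sketch.
        ch = Channel(deadline_s=0.01)
        ok, elapsed = ch.send(b"sensor frame", lambda p: time.sleep(0.002))
        print("deadline met" if ok else "deadline missed", f"({elapsed * 1e3:.2f} ms)")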

    Dynamic Management Of Heterogeneous Resources

    This paper presents techniques for dynamic load balancing in heterogeneous computing environments, that is, for sets of machines with varying processing capabilities and memory capacities. These methods can also be applied to homogeneous systems in which the effective compute speed or memory availability is reduced by the presence of other programs running outside the target computation. To handle heterogeneous systems, a precise distinction is made between an abstract quantity of work, which might be measured as the number of iterations of a loop or the count of some data structure, and the utilization of resources, measured in seconds of processor time or bytes of memory, required by that work. Once that distinction is clearly drawn, the modifications to existing load balancing techniques are fairly straightforward. The effectiveness of the resulting load balancing system is demonstrated for a large-scale particle simulation on a network of heterogeneous PCs, workstations, and multiprocessor servers.
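
    The central distinction is between abstract work units (iterations, particles) and the resources that work consumes on a particular machine. A minimal sketch of that separation, assuming each machine's rate has already been measured in work units per second (function name and numbers are illustrative only):

        def partition_by_rate(total_work, rates):
            """Split `total_work` abstract work units (iterations, particles, ...)
            so that every machine needs roughly the same wall-clock time.

            rates: measured speeds in work units per second, which differ across
            heterogeneous machines or when outside programs steal cycles.
            """
            total_rate = sum(rates)
            # Each machine's expected time is share / rate == total_work / total_rate,
            # the same for all machines even though the work counts differ.
            return [total_work * r / total_rate for r in rates]

        # Example: one million particles over three machines of differing speed.
        print(partition_by_rate(1_000_000, rates=[50_000, 120_000, 30_000]))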

    Building a High-Performance Collective Communication Library

    We report on a project to develop a unified approach for building a library of collective communication operations that performs well on a cross-section of problems encountered in real applications. The target architecture is a two-dimensional mesh with wormhole routing, but the techniques are more general. The approach differs from traditional library implementations in that we address the need for implementations that perform well for vectors of various sizes and for various grid dimensions, including non-power-of-two grids. We show how a general approach to hybrid algorithms yields performance across the entire range of vector lengths. Moreover, many scalable implementations of application libraries require collective communication within groups of nodes; our approach yields the same kind of performance for group collective communication. Results from the Intel Paragon system are included.
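
    A hybrid algorithm switches between a latency-bound method for short vectors and a bandwidth-bound method for long ones, with the crossover chosen from a simple cost model. The sketch below illustrates the selection idea for broadcast; the algorithms, cost formulas, and constants are textbook-style approximations chosen for illustration, not the library's actual implementation:

        from math import log2

        def choose_broadcast(n_bytes, p, alpha=1e-5, beta=1e-9):
            """Pick a broadcast algorithm from a linear latency/bandwidth cost model.

            alpha: per-message start-up cost (s); beta: per-byte transfer time (s).
            """
            # Binomial tree: log2(p) messages of the full vector (latency-friendly).
            tree = log2(p) * (alpha + beta * n_bytes)
            # Scatter followed by allgather: more messages, but each node forwards
            # only ~2(p-1)/p of the vector (bandwidth-friendly for long vectors).
            scatter_allgather = (log2(p) + p - 1) * alpha + 2 * (p - 1) / p * beta * n_bytes
            return "binomial tree" if tree <= scatter_allgather else "scatter+allgather"

        for n in (64, 4_096, 1_048_576):
            print(n, "bytes ->", choose_broadcast(n, p=64))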

    Dynamic load balancing and granularity control on heterogeneous and hybrid architectures

    The past several years have seen concurrent applications grow increasingly complex, as the most advanced techniques from academia find their way into production parallel applications. Moreover, the platforms on which these concurrent computations now execute are frequently heterogeneous networks of workstations and shared-memory multiprocessors, because of their low cost relative to traditional large-scale multicomputers. The combination of sophisticated algorithms and more complex computing environments has made existing load balancing techniques obsolete. Current methods characterize the loads of tasks in very simple terms, often fail to account for the communication costs of an application, and typically consider computational resources to be homogeneous. The complexity of current applications, coupled with the fact that they are running in heterogeneous environments, has also made partitioning a problem for concurrent execution an ordeal. It is no longer adequate to simply divide the problem into some number of pieces per computer and hope for the best. In a complex application, the workloads of the pieces, which may be equal initially, may diverge over time. On a heterogeneous network, the varying capabilities of the computers will widen this disparity in resource usage even further. Thus, there is a need to dynamically manage the granularity of an application, repartitioning the problem at runtime to correct inadequacies in the original partitioning and to make more effective use of computational resources. This thesis presents techniques for dynamic load balancing in complex irregular applications. Advances over previous work are threefold: First, these techniques are applicable to networks composed of heterogeneous machines, including both single-processor workstations and personal computers, and multiprocessor compute servers. Second, the use of load vectors more accurately characterizes the resource requirements of tasks, including the computational demands of different algorithmic phases as well as the needs for other resources, such as memory. Finally, runtime repartitioning adjusts the granularity of the problem so that the available resources are more fully utilized. Two further improvements over earlier techniques are better algorithms for determining the ideal redistribution of work and advanced techniques for selecting which tasks to transfer to satisfy those ideals. The latter algorithms incorporate the notion of task migration costs, including the impact on an application's communication locality. The improvements listed above are demonstrated on both industrial applications and small parametric problems, on networks of heterogeneous computers as well as traditional large-scale multicomputers.
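
    One of the improvements described is selecting which tasks to transfer while accounting for migration cost and the impact on communication locality. The following greedy sketch illustrates that trade-off; the scoring function and task representation are invented for illustration and are not the thesis's actual algorithm:

        def select_tasks(tasks, target_work, locality_weight=1.0):
            """Greedily pick tasks to migrate until roughly `target_work` is moved.

            Each task is (work, migration_cost, remote_edges), where remote_edges
            counts communication partners that would become off-node after the move.
            Tasks moving the most work per unit of migration/locality cost go first.
            """
            def score(task):
                work, migration_cost, remote_edges = task
                return work / (migration_cost + locality_weight * remote_edges + 1e-9)

            chosen, moved = [], 0.0
            for task in sorted(tasks, key=score, reverse=True):
                if moved >= target_work:
                    break
                chosen.append(task)
                moved += task[0]
            return chosen, moved

        tasks = [(5.0, 2.0, 1), (3.0, 0.5, 0), (8.0, 6.0, 4), (2.0, 0.2, 0)]
        print(select_tasks(tasks, target_work=7.0))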

    A Practical Approach to Dynamic Load Balancing

    This paper presents a cohesive, practical load balancing framework that improves upon existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines such as the Cray T3D/E and Intel Paragon, shared-memory systems such as the SGI PowerChallenge, and networks of workstations. As part of the work, an adaptive heat diffusion scheme is presented, as well as a task selection mechanism that can preserve or improve communication locality. Unlike many previous efforts in this arena, the techniques have been applied to two large-scale industrial applications on a variety of multicomputers. In the process, this work exposes a serious deficiency in current load balancing strategies, motivating further work in this area.
    Keywords: dynamic load balancing, diffusion, massively parallel computing, irregular problems
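
    The adaptive heat diffusion scheme builds on first-order diffusion balancers, in which each processor repeatedly exchanges a fraction of its load difference with every neighbor. A minimal, non-adaptive sketch of that underlying iteration (the diffusion coefficient and processor graph below are illustrative, not the paper's adaptive scheme):

        def diffuse(load, neighbors, alpha=0.25, sweeps=50):
            """First-order diffusion: in each sweep, node i exchanges
            alpha * (load[i] - load[j]) with every neighbor j.

            load: per-processor loads; neighbors: adjacency list of the processor
            graph (e.g., a mesh); alpha must be small enough for stability.
            """
            load = list(load)
            for _ in range(sweeps):
                delta = [0.0] * len(load)
                for i, nbrs in enumerate(neighbors):
                    for j in nbrs:
                        delta[i] -= alpha * (load[i] - load[j])
                load = [l + d for l, d in zip(load, delta)]
            return load

        # Four processors in a ring, with an initial hot spot on processor 0.
        print(diffuse([10.0, 0.0, 0.0, 2.0], neighbors=[[1, 3], [0, 2], [1, 3], [0, 2]]))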

    Practical Dynamic Load Balancing for Irregular Problems

    In this paper, we present a cohesive, practical load balancing framework that addresses many shortcomings of existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines such as the Cray T3D and Intel Paragon, shared-memory systems such as the SGI Power Challenge, and networks of workstations. This scheme improves on earlier work in this area and can be analyzed using well-understood techniques. The algorithm operates using nearest-neighbor communication and inherently maintains existing locality in the application. A simple software interface allows the programmer to use load balancing with very little effort. Unlike many previous efforts in this arena, the techniques have been applied to large-scale industrial applications, one of which is described herein.
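
    The paper highlights a simple software interface that lets the programmer invoke nearest-neighbor load balancing with little effort. The sketch below shows what such a call pattern could look like; the class, callback, and heuristic are hypothetical and are not the paper's actual interface:

        class LoadBalancer:
            """Hypothetical programmer-facing interface (names are invented here,
            not the paper's actual API)."""

            def __init__(self, weight):
                self.weight = weight  # user callback: task -> estimated cost

            def balance(self, my_tasks, neighbor_loads):
                """Split local tasks into (keep, send) using only the local load
                and the loads reported by nearest neighbors."""
                my_load = sum(self.weight(t) for t in my_tasks)
                avg = (my_load + sum(neighbor_loads)) / (1 + len(neighbor_loads))
                keep, send = [], []
                for t in sorted(my_tasks, key=self.weight):
                    if my_load > avg and self.weight(t) <= my_load - avg:
                        send.append(t)
                        my_load -= self.weight(t)
                    else:
                        keep.append(t)
                return keep, send

        # Called once per iteration from the application's main loop.
        lb = LoadBalancer(weight=len)  # here a task is just a list of particles
        print(lb.balance(my_tasks=[[1] * 30, [1] * 20, [1] * 5], neighbor_loads=[10.0, 15.0]))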

    A Load Balancing Technique for Multiphase Computations

    Parallel computations composed of multiple, tightly interwoven phases of computation may require a different approach to dynamic load balancing than single-phase computations. This paper presents a load sharing method based on a view of load as a vector rather than as a scalar. This approach allows multiphase computations to achieve higher efficiency on large-scale multicomputers than is possible with traditional techniques. Results are presented for two large-scale particle simulations running on 128 nodes of an Intel Paragon and on 256 processors of a Cray T3D, respectively.
    INTRODUCTION
    Load balancing techniques already in the literature have concentrated entirely on single-phase computations (Boillat 1990; Cybenko 1989; Evans and Butt 1993; Heirich and Taylor 1995; Horton 1993; Kohring 1995; Lin and Keller 1987; Muniz and Zaluska 1995; Song 1994; Walshaw and Berzins 1995; Watts et al. 1996; Willebeek-LeMair and Reeves 1993; Williams 1991; Xu and Lau 1997). That is, they work only ..
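
    Viewing load as a vector means recording each node's cost per phase and balancing every component, because tightly synchronized phases each run at the pace of their slowest node. A small worked example of the difference between the scalar and vector views, with made-up numbers:

        def imbalance(per_node_loads):
            """Max/mean ratio: 1.0 means perfectly balanced."""
            mean = sum(per_node_loads) / len(per_node_loads)
            return max(per_node_loads) / mean

        # Two nodes, two tightly synchronized phases (say, a fluid step and a
        # particle step). Loads are vectors: [phase-1 seconds, phase-2 seconds].
        node_a = [9.0, 1.0]
        node_b = [1.0, 9.0]

        # Scalar view: both nodes do 10 s of work in total, so it looks balanced.
        print(imbalance([sum(node_a), sum(node_b)]))  # 1.0

        # Vector view: each phase waits for its slowest node, so the run takes
        # 9 s + 9 s = 18 s instead of the ideal 10 s.
        print(imbalance([node_a[0], node_b[0]]))  # 1.8 in phase 1
        print(imbalance([node_a[1], node_b[1]]))  # 1.8 in phase 2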